Very deep convolutional neural networks for LVCSR
نویسندگان
چکیده
Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance when embedded in large vocabulary continuous speech recognition (LVCSR) systems due to its capability of modeling local correlations and reducing translational variations. In all previous related works for ASR, only up to two convolutional layers are employed. In light of the recent success of very deep CNNs in image classification, it is of interest to investigate the deep structure of CNNs for speech recognition in detail. In contrast to image classification, the dimensionality of the speech feature, the span size of input feature and the relationship between temporal and spectral domain are new factors to consider while designing very deep CNNs. In this work, very deep CNNs are introduced for LVCSR task, by extending depth of convolutional layers up to ten. The contribution of this work is two-fold: performance improvement of very deep CNNs is investigated under different configurations; further, a better way to perform convolution operations on temporal dimension is proposed. Experiments showed that very deep CNNs offer a 8-12% relative improvement over baseline DNN system, and a 4-7% relative improvement over baseline CNN system, evaluated on both a 15-hr Callhome and a 51-hr Switchboard LVCSR tasks.
منابع مشابه
Provide a Deep Convolutional Neural Network Optimized with Morphological Filters to Map Trees in Urban Environments Using Aerial Imagery
Today, we cannot ignore the role of trees in the quality of human life, so that the earth is inconceivable for humans without the presence of trees. In addition to their natural role, urban trees are also very important in terms of visual beauty. Aerial imagery using unmanned platforms with very high spatial resolution is available today. Convolutional neural networks based deep learning method...
متن کاملEstimation of Hand Skeletal Postures by Using Deep Convolutional Neural Networks
Hand posture estimation attracts researchers because of its many applications. Hand posture recognition systems simulate the hand postures by using mathematical algorithms. Convolutional neural networks have provided the best results in the hand posture recognition so far. In this paper, we propose a new method to estimate the hand skeletal posture by using deep convolutional neural networks. T...
متن کاملCystoscopy Image Classication Using Deep Convolutional Neural Networks
In the past three decades, the use of smart methods in medical diagnostic systems has attractedthe attention of many researchers. However, no smart activity has been provided in the eld ofmedical image processing for diagnosis of bladder cancer through cystoscopy images despite the highprevalence in the world. In this paper, two well-known convolutional neural networks (CNNs) ...
متن کاملA multi-scale convolutional neural network for automatic cloud and cloud shadow detection from Gaofen-1 images
The reconstruction of the information contaminated by cloud and cloud shadow is an important step in pre-processing of high-resolution satellite images. The cloud and cloud shadow automatic segmentation could be the first step in the process of reconstructing the information contaminated by cloud and cloud shadow. This stage is a remarkable challenge due to the relatively inefficient performanc...
متن کاملشبکه عصبی پیچشی با پنجرههای قابل تطبیق برای بازشناسی گفتار
Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
متن کامل